Overview
Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 19459 |
| Missing cells | 11482 |
| Missing cells (%) | 3.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 8.9 MiB |
| Average record size in memory | 482.0 B |
Variable types
| Categorical | 8 |
|---|---|
| Text | 3 |
| Numeric | 8 |
Alerts
| Alert | Type |
|---|---|
| followers is highly overall correlated with following and 6 other fields | High correlation |
| following is highly overall correlated with followers and 4 other fields | High correlation |
| label is highly overall correlated with text_bot_count | High correlation |
| log_followers is highly overall correlated with followers and 6 other fields | High correlation |
| log_following is highly overall correlated with followers and 4 other fields | High correlation |
| log_public_gists is highly overall correlated with followers and 4 other fields | High correlation |
| log_public_repos is highly overall correlated with followers and 6 other fields | High correlation |
| public_gists is highly overall correlated with followers and 4 other fields | High correlation |
| public_repos is highly overall correlated with followers and 6 other fields | High correlation |
| text_bot_count is highly overall correlated with label and 1 other field | High correlation |
| type is highly overall correlated with text_bot_count | High correlation |
| label is highly imbalanced (67.2%) | Imbalance |
| type is highly imbalanced (92.8%) | Imbalance |
| site_admin is highly imbalanced (95.7%) | Imbalance |
| text_bot_count is highly imbalanced (93.5%) | Imbalance |
| bio has 10758 (55.3%) missing values | Missing |
| public_repos is highly skewed (γ1 = 54.52274234) | Skewed |
| public_gists is highly skewed (γ1 = 69.95022964) | Skewed |
| followers is highly skewed (γ1 = 32.0725488) | Skewed |
| following is highly skewed (γ1 = 30.4418986) | Skewed |
| public_repos has 966 (5.0%) zeros | Zeros |
| public_gists has 7790 (40.0%) zeros | Zeros |
| followers has 1436 (7.4%) zeros | Zeros |
| following has 5901 (30.3%) zeros | Zeros |
| log_public_repos has 966 (5.0%) zeros | Zeros |
| log_public_gists has 7790 (40.0%) zeros | Zeros |
| log_followers has 1436 (7.4%) zeros | Zeros |
| log_following has 5901 (30.3%) zeros | Zeros |
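The percentages in the Imbalance alerts above appear consistent with a score of one minus the normalized Shannon entropy of the class frequencies. The sketch below illustrates that reading. It is an assumption inferred from the reported numbers, not the tool's documented implementation, and `imbalance_score` is a hypothetical helper name. The class counts are the ones reported for label and type in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts (hypothetical helper).

    H is the Shannon entropy of the class distribution in bits and k is the
    number of classes. 0 means perfectly balanced; values near 1 mean that
    one class holds almost every observation.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(k)


# Class counts taken from the label and type sections of this report.
label_score = imbalance_score([18287, 1172])
type_score = imbalance_score([19289, 170])

print(f"label: {label_score:.1%}")
print(f"type:  {type_score:.1%}")
```

Under this assumption the two scores round to the 67.2% and 92.8% shown in the alerts, which suggests the alert measures how far the distribution is from uniform rather than the share of the majority class.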
Reproduction
| Analysis started | 2025-01-04 14:34:33.728733 |
|---|---|
| Analysis finished | 2025-01-04 14:34:42.078433 |
| Duration | 8.35 seconds |
| Software version | ydata-profiling v4.12.1 |
| Download configuration | config.json |
Variables
label
Categorical
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.7 MiB |
| Human | 18287 |
|---|---|
| Bot | 1172 |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 4.8795416 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Human |
|---|---|
| 2nd row | Human |
| 3rd row | Human |
| 4th row | Bot |
| 5th row | Human |
Common Values
| Value | Count | Frequency (%) |
| Human | 18287 | 94.0% |
| Bot | 1172 | 6.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| H | 18287 | |
| u | 18287 | |
| m | 18287 | |
| a | 18287 | |
| n | 18287 | |
| B | 1172 | 1.2% |
| o | 1172 | 1.2% |
| t | 1172 | 1.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 75492 | |
| Uppercase Letter | 19459 | 20.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 18287 | |
| m | 18287 | |
| a | 18287 | |
| n | 18287 | |
| o | 1172 | 1.6% |
| t | 1172 | 1.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 18287 | |
| B | 1172 | 6.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 94951 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| H | 18287 | |
| u | 18287 | |
| m | 18287 | |
| a | 18287 | |
| n | 18287 | |
| B | 1172 | 1.2% |
| o | 1172 | 1.2% |
| t | 1172 | 1.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 94951 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| H | 18287 | |
| u | 18287 | |
| m | 18287 | |
| a | 18287 | |
| n | 18287 | |
| B | 1172 | 1.2% |
| o | 1172 | 1.2% |
| t | 1172 | 1.2% |
type
Categorical
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
| 1 | 19289 |
|---|---|
| 0 | 170 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 19289 | 99.1% |
| 0 | 170 | 0.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 19289 | |
| 0 | 170 | 0.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 19459 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 19289 | |
| 0 | 170 | 0.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 19459 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 19289 | |
| 0 | 170 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19459 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 19289 | |
| 0 | 170 | 0.9% |
site_admin
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
| 0 | 19369 |
|---|---|
| 1 | 90 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 19369 | 99.5% |
| 1 | 90 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 19369 | |
| 1 | 90 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 19459 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 19369 | |
| 1 | 90 | 0.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 19459 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 19369 | |
| 1 | 90 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19459 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 19369 | |
| 1 | 90 | 0.5% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 10635 | 54.7% |
| 0 | 8824 | 45.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 10635 | |
| 0 | 8824 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 19459 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 10635 | |
| 0 | 8824 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 19459 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 10635 | |
| 0 | 8824 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19459 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 10635 | |
| 0 | 8824 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 11084 | 57.0% |
| 1 | 8375 | 43.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 11084 | |
| 1 | 8375 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 19459 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 11084 | |
| 1 | 8375 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 19459 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 11084 | |
| 1 | 8375 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19459 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 11084 | |
| 1 | 8375 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 12502 | 64.2% |
| 0 | 6957 | 35.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 12502 | |
| 0 | 6957 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 19459 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 12502 | |
| 0 | 6957 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 19459 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 12502 | |
| 0 | 6957 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19459 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 12502 | |
| 0 | 6957 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 16228 | 83.4% |
| 1 | 3231 | 16.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 16228 | |
| 1 | 3231 | 16.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 19459 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 16228 | |
| 1 | 3231 | 16.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 19459 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 16228 | |
| 1 | 3231 | 16.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19459 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 16228 | |
| 1 | 3231 | 16.6% |
bio
Text
Missing 
| Distinct | 8511 |
|---|---|
| Distinct (%) | 97.8% |
| Missing | 10758 |
| Missing (%) | 55.3% |
| Memory size | 2.2 MiB |
Length
| Max length | 3013 |
|---|---|
| Median length | 311 |
| Mean length | 66.395817 |
| Min length | 1 |
Unique
| Unique | 8446 |
|---|---|
| Unique (%) | 97.1% |
Sample
| 1st row | I just press the buttons randomly, and the program evolves... |
|---|---|
| 2nd row | Time is unimportant, only life important. |
| 3rd row | Done studying. Need challenges. |
| 4th row | Administrator of MOONGIFT that is introducing open source software everyday to Japanese engineers since 2004. |
| 5th row | Senior Software Engineer at Google, working on Certificate Transparency and generalized transparency. |
| Value | Count | Frequency (%) |
| 3017 | 3.9% | |
| and | 2499 | 3.2% |
| engineer | 1568 | 2.0% |
| software | 1493 | 1.9% |
| of | 1469 | 1.9% |
| at | 1372 | 1.8% |
| developer | 1210 | 1.6% |
| the | 1073 | 1.4% |
| a | 1031 | 1.3% |
| i | 1021 | 1.3% |
| Other values (15629) | 62170 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 69483 | 12.0% |
| e | 50777 | 8.8% |
| a | 32167 | 5.6% |
| o | 32151 | 5.6% |
| r | 31682 | 5.5% |
| n | 31423 | 5.4% |
| t | 30985 | 5.4% |
| i | 28328 | 4.9% |
| s | 20313 | 3.5% |
| l | 15440 | 2.7% |
| Other values (185) | 234961 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 398013 | |
| Space Separator | 69488 | 12.0% |
| Uppercase Letter | 49508 | 8.6% |
| Other Punctuation | 29722 | 5.1% |
| Decimal Number | 14833 | 2.6% |
| Dash Punctuation | 3735 | 0.6% |
| Math Symbol | 3721 | 0.6% |
| Control | 3047 | 0.5% |
| Open Punctuation | 2229 | 0.4% |
| Private Use | 1459 | 0.3% |
| Other values (7) | 1955 | 0.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 50777 | |
| a | 32167 | 8.1% |
| o | 32151 | 8.1% |
| r | 31682 | 8.0% |
| n | 31423 | 7.9% |
| t | 30985 | 7.8% |
| i | 28328 | 7.1% |
| s | 20313 | 5.1% |
| l | 15440 | 3.9% |
| c | 14098 | 3.5% |
| Other values (49) | 110649 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 5632 | 11.4% |
| C | 3765 | 7.6% |
| E | 2996 | 6.1% |
| I | 2911 | 5.9% |
| T | 2822 | 5.7% |
| P | 2821 | 5.7% |
| D | 2722 | 5.5% |
| A | 2711 | 5.5% |
| F | 2513 | 5.1% |
| M | 2315 | 4.7% |
| Other values (42) | 18300 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 9327 | |
| . | 7615 | |
| @ | 4145 | |
| : | 2362 | 7.9% |
| / | 1983 | 6.7% |
| ? | 853 | 2.9% |
| ' | 744 | 2.5% |
| & | 655 | 2.2% |
| ! | 380 | 1.3% |
| # | 305 | 1.0% |
| Other values (13) | 1353 | 4.6% |
Math Symbol
| Value | Count | Frequency (%) |
| | | 1127 | |
| + | 904 | |
| ∏ | 384 | 10.3% |
| ¬ | 162 | 4.4% |
| ∑ | 155 | 4.2% |
| ∫ | 148 | 4.0% |
| √ | 119 | 3.2% |
| ≠ | 119 | 3.2% |
| ∞ | 118 | 3.2% |
| ± | 88 | 2.4% |
| Other values (9) | 397 | 10.7% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4832 | |
| 2 | 2341 | |
| 1 | 2299 | |
| 3 | 1208 | 8.1% |
| 5 | 825 | 5.6% |
| 4 | 822 | 5.5% |
| 9 | 737 | 5.0% |
| 6 | 641 | 4.3% |
| 8 | 591 | 4.0% |
| 7 | 537 | 3.6% |
Open Punctuation
| Value | Count | Frequency (%) |
| ‚ | 1226 | |
| ( | 530 | |
| „ | 397 | 17.8% |
| [ | 57 | 2.6% |
| { | 19 | 0.9% |
Other Symbol
| Value | Count | Frequency (%) |
| ® | 192 | |
| ° | 100 | |
| © | 97 | |
| ™ | 78 | |
| ◊ | 14 | 2.9% |
Currency Symbol
| Value | Count | Frequency (%) |
| ¥ | 74 | |
| £ | 57 | |
| ¢ | 40 | |
| $ | 13 | 7.1% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ¨ | 62 | |
| ´ | 53 | |
| ` | 14 | 10.6% |
| ^ | 3 | 2.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 3491 | |
| – | 157 | 4.2% |
| — | 87 | 2.3% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 549 | |
| ] | 58 | 9.3% |
| } | 18 | 2.9% |
Space Separator
| Value | Count | Frequency (%) |
| 69483 | ||
| 5 | < 0.1% |
Other Letter
| Value | Count | Frequency (%) |
| º | 206 | |
| ª | 176 |
Control
| Value | Count | Frequency (%) |
| 3047 |
Private Use
| Value | Count | Frequency (%) |
| | 1459 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 145 |
Final Punctuation
| Value | Count | Frequency (%) |
| » | 6 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 447652 | |
| Common | 128408 | 22.2% |
| Unknown | 1459 | 0.3% |
| Greek | 191 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 50777 | 11.3% |
| a | 32167 | 7.2% |
| o | 32151 | 7.2% |
| r | 31682 | 7.1% |
| n | 31423 | 7.0% |
| t | 30985 | 6.9% |
| i | 28328 | 6.3% |
| s | 20313 | 4.5% |
| l | 15440 | 3.4% |
| c | 14098 | 3.1% |
| Other values (100) | 160288 |
Common
| Value | Count | Frequency (%) |
| 69483 | ||
| , | 9327 | 7.3% |
| . | 7615 | 5.9% |
| 0 | 4832 | 3.8% |
| @ | 4145 | 3.2% |
| - | 3491 | 2.7% |
| 3047 | 2.4% | |
| : | 2362 | 1.8% |
| 2 | 2341 | 1.8% |
| 1 | 2299 | 1.8% |
| Other values (72) | 19466 | 15.2% |
Greek
| Value | Count | Frequency (%) |
| Ω | 112 | |
| π | 79 |
Unknown
| Value | Count | Frequency (%) |
| | 1459 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 560027 | |
| None | 12464 | 2.2% |
| Punctuation | 2400 | 0.4% |
| PUA | 1459 | 0.3% |
| Math Operators | 1268 | 0.2% |
| Letterlike Symbols | 78 | < 0.1% |
| Geometric Shapes | 14 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 69483 | 12.4% | |
| e | 50777 | 9.1% |
| a | 32167 | 5.7% |
| o | 32151 | 5.7% |
| r | 31682 | 5.7% |
| n | 31423 | 5.6% |
| t | 30985 | 5.5% |
| i | 28328 | 5.1% |
| s | 20313 | 3.6% |
| l | 15440 | 2.8% |
| Other values (86) | 217278 |
None
| Value | Count | Frequency (%) |
| ü | 1509 | 12.1% |
| Ä | 992 | 8.0% |
| Â | 438 | 3.5% |
| ç | 364 | 2.9% |
| ñ | 364 | 2.9% |
| Ô | 350 | 2.8% |
| è | 337 | 2.7% |
| Å | 323 | 2.6% |
| Ê | 274 | 2.2% |
| Á | 264 | 2.1% |
| Other values (66) | 7249 |
PUA
| Value | Count | Frequency (%) |
| | 1459 |
Punctuation
| Value | Count | Frequency (%) |
| ‚ | 1226 | |
| „ | 397 | 16.5% |
| ‰ | 239 | 10.0% |
| – | 157 | 6.5% |
| † | 147 | 6.1% |
| • | 134 | 5.6% |
| — | 87 | 3.6% |
| ‡ | 8 | 0.3% |
| … | 5 | 0.2% |
Math Operators
| Value | Count | Frequency (%) |
| ∏ | 384 | |
| ∑ | 155 | |
| ∫ | 148 | 11.7% |
| √ | 119 | 9.4% |
| ≠ | 119 | 9.4% |
| ∞ | 118 | 9.3% |
| ≥ | 85 | 6.7% |
| ∂ | 70 | 5.5% |
| ≤ | 57 | 4.5% |
| ≈ | 12 | 0.9% |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 78 |
Geometric Shapes
| Value | Count | Frequency (%) |
| ◊ | 14 |
public_repos
Real number (ℝ)
High correlation  Skewed  Zeros 
| Distinct | 667 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 46 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 83.189461 |
| Minimum | 0 |
|---|---|
| Maximum | 50000 |
| Zeros | 966 |
| Zeros (%) | 5.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 820.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 11 |
| median | 34 |
| Q3 | 82 |
| 95-th percentile | 248 |
| Maximum | 50000 |
| Range | 50000 |
| Interquartile range (IQR) | 71 |
Descriptive statistics
| Standard deviation | 575.11725 |
|---|---|
| Coefficient of variation (CV) | 6.9133427 |
| Kurtosis | 3754.597 |
| Mean | 83.189461 |
| Median Absolute Deviation (MAD) | 28 |
| Skewness | 54.522742 |
| Sum | 1614957 |
| Variance | 330759.85 |
| Monotonicity | Not monotonic |
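Several of the statistics reported above for public_repos are tied together by simple identities, which makes them easy to sanity-check. The sketch below recomputes the derived figures from the reported mean, standard deviation, quartiles, and extremes. The variable names are illustrative only, and the inputs are the rounded values printed in the tables, so the results agree with the report only up to rounding.

```python
# Values copied from the public_repos tables above (already rounded).
mean = 83.189461
std = 575.11725
q1, q3 = 11, 82
minimum, maximum = 0, 50000

cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance = squared standard deviation
iqr = q3 - q1                    # interquartile range = Q3 - Q1
value_range = maximum - minimum  # range = max - min

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.2f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
```

A coefficient of variation near 7, combined with a median of 34 against a mean of about 83, is in line with the heavy right skew flagged for this column.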
| Value | Count | Frequency (%) |
| 0 | 966 | 5.0% |
| 1 | 556 | 2.9% |
| 2 | 468 | 2.4% |
| 3 | 394 | 2.0% |
| 4 | 377 | 1.9% |
| 6 | 362 | 1.9% |
| 5 | 350 | 1.8% |
| 7 | 328 | 1.7% |
| 9 | 306 | 1.6% |
| 8 | 300 | 1.5% |
| Other values (657) | 15006 |
| Value | Count | Frequency (%) |
| 0 | 966 | |
| 1 | 556 | |
| 2 | 468 | |
| 3 | 394 | |
| 4 | 377 | 1.9% |
| 5 | 350 | 1.8% |
| 6 | 362 | 1.9% |
| 7 | 328 | 1.7% |
| 8 | 300 | 1.5% |
| 9 | 306 | 1.6% |
| Value | Count | Frequency (%) |
| 50000 | 1 | |
| 27746 | 1 | |
| 26360 | 1 | |
| 22618 | 1 | |
| 20693 | 1 | |
| 17425 | 1 | |
| 16985 | 1 | |
| 16839 | 1 | |
| 9554 | 1 | |
| 7068 | 1 |
public_gists
Real number (ℝ)
High correlation  Skewed  Zeros 
| Distinct | 390 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 38 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28.5216 |
| Minimum | 0 |
|---|---|
| Maximum | 55781 |
| Zeros | 7790 |
| Zeros (%) | 40.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 820.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 2 |
| Q3 | 10 |
| 95-th percentile | 70 |
| Maximum | 55781 |
| Range | 55781 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 653.4023 |
|---|---|
| Coefficient of variation (CV) | 22.909034 |
| Kurtosis | 5437.9943 |
| Mean | 28.5216 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 69.95023 |
| Sum | 553918 |
| Variance | 426934.57 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 7790 | |
| 1 | 1818 | 9.3% |
| 2 | 1118 | 5.7% |
| 3 | 804 | 4.1% |
| 4 | 652 | 3.4% |
| 5 | 610 | 3.1% |
| 6 | 473 | 2.4% |
| 7 | 398 | 2.0% |
| 9 | 323 | 1.7% |
| 8 | 311 | 1.6% |
| Other values (380) | 5124 |
| Value | Count | Frequency (%) |
| 0 | 7790 | |
| 1 | 1818 | 9.3% |
| 2 | 1118 | 5.7% |
| 3 | 804 | 4.1% |
| 4 | 652 | 3.4% |
| 5 | 610 | 3.1% |
| 6 | 473 | 2.4% |
| 7 | 398 | 2.0% |
| 8 | 311 | 1.6% |
| 9 | 323 | 1.7% |
| Value | Count | Frequency (%) |
| 55781 | 1 | |
| 53660 | 1 | |
| 28943 | 1 | |
| 26879 | 1 | |
| 15482 | 1 | |
| 12328 | 1 | |
| 10604 | 1 | |
| 8924 | 1 | |
| 4461 | 1 | |
| 4163 | 1 |
followers
Real number (ℝ)
High correlation  Skewed  Zeros 
| Distinct | 1564 |
|---|---|
| Distinct (%) | 8.0% |
| Missing | 30 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 244.56544 |
| Minimum | 0 |
|---|---|
| Maximum | 95752 |
| Zeros | 1436 |
| Zeros (%) | 7.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 820.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 7 |
| median | 33 |
| Q3 | 124 |
| 95-th percentile | 822.6 |
| Maximum | 95752 |
| Range | 95752 |
| Interquartile range (IQR) | 117 |
Descriptive statistics
| Standard deviation | 1555.0957 |
|---|---|
| Coefficient of variation (CV) | 6.3586076 |
| Kurtosis | 1525.2067 |
| Mean | 244.56544 |
| Median Absolute Deviation (MAD) | 31 |
| Skewness | 32.072549 |
| Sum | 4751662 |
| Variance | 2418322.6 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1436 | 7.4% |
| 1 | 796 | 4.1% |
| 2 | 616 | 3.2% |
| 3 | 504 | 2.6% |
| 4 | 443 | 2.3% |
| 5 | 409 | 2.1% |
| 6 | 396 | 2.0% |
| 7 | 342 | 1.8% |
| 8 | 337 | 1.7% |
| 9 | 308 | 1.6% |
| Other values (1554) | 13842 |
| Value | Count | Frequency (%) |
| 0 | 1436 | |
| 1 | 796 | |
| 2 | 616 | |
| 3 | 504 | 2.6% |
| 4 | 443 | 2.3% |
| 5 | 409 | 2.1% |
| 6 | 396 | 2.0% |
| 7 | 342 | 1.8% |
| 8 | 337 | 1.7% |
| 9 | 308 | 1.6% |
| Value | Count | Frequency (%) |
| 95752 | 1 | |
| 84979 | 1 | |
| 66203 | 1 | |
| 58452 | 1 | |
| 31120 | 1 | |
| 30287 | 1 | |
| 29719 | 1 | |
| 29414 | 1 | |
| 28411 | 1 | |
| 27775 | 1 |
following
Real number (ℝ)
High correlation  Skewed  Zeros 
| Distinct | 607 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 151 |
| Missing (%) | 0.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 42.653045 |
| Minimum | 0 |
|---|---|
| Maximum | 16741 |
| Zeros | 5901 |
| Zeros (%) | 30.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 820.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 4 |
| Q3 | 22 |
| 95-th percentile | 146 |
| Maximum | 16741 |
| Range | 16741 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 309.26905 |
|---|---|
| Coefficient of variation (CV) | 7.2508081 |
| Kurtosis | 1228.9191 |
| Mean | 42.653045 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 30.441899 |
| Sum | 823545 |
| Variance | 95647.344 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 5901 | |
| 1 | 1709 | 8.8% |
| 2 | 1073 | 5.5% |
| 3 | 774 | 4.0% |
| 4 | 596 | 3.1% |
| 5 | 518 | 2.7% |
| 6 | 470 | 2.4% |
| 7 | 399 | 2.1% |
| 8 | 357 | 1.8% |
| 9 | 317 | 1.6% |
| Other values (597) | 7194 |
| Value | Count | Frequency (%) |
| 0 | 5901 | |
| 1 | 1709 | 8.8% |
| 2 | 1073 | 5.5% |
| 3 | 774 | 4.0% |
| 4 | 596 | 3.1% |
| 5 | 518 | 2.7% |
| 6 | 470 | 2.4% |
| 7 | 399 | 2.1% |
| 8 | 357 | 1.8% |
| 9 | 317 | 1.6% |
| Value | Count | Frequency (%) |
| 16741 | 1 | |
| 15931 | 1 | |
| 11921 | 1 | |
| 10268 | 1 | |
| 9720 | 1 | |
| 9686 | 1 | |
| 9532 | 1 | |
| 9367 | 1 | |
| 7374 | 1 | |
| 6207 | 1 |
created_at
Text
| Distinct | 19434 |
|---|---|
| Distinct (%) | > 99.9% |
| Missing | 23 |
| Missing (%) | 0.1% |
| Memory size | 2.0 MiB |
Length
| Max length | 27 |
|---|---|
| Median length | 25 |
| Mean length | 24.974532 |
| Min length | 1 |
Unique
| Unique | 19432 |
|---|---|
| Unique (%) | > 99.9% |
Sample
| 1st row | 2011-09-26 17:27:03+00:00 |
|---|---|
| 2nd row | 2015-06-29 10:12:46+00:00 |
| 3rd row | 2008-08-29 16:20:03+00:00 |
| 4th row | 2014-05-20 18:43:09+00:00 |
| 5th row | 2012-08-16 14:19:13+00:00 |
| Value | Count | Frequency (%) |
| 2012-07-05 | 18 | < 0.1% |
| 2014-10-08 | 15 | < 0.1% |
| 2013-06-10 | 15 | < 0.1% |
| 2017-06-09 | 14 | < 0.1% |
| 2013-01-25 | 14 | < 0.1% |
| 2014-05-16 | 14 | < 0.1% |
| 2013-01-15 | 13 | < 0.1% |
| 2014-04-15 | 13 | < 0.1% |
| 2012-06-18 | 13 | < 0.1% |
| 2013-05-14 | 13 | < 0.1% |
| Other values (21976) | 38709 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 144298 | |
| : | 58239 | |
| 1 | 57550 | 11.9% |
| 2 | 50192 | 10.3% |
| - | 38826 | 8.0% |
| 3 | 19515 | 4.0% |
| 19417 | 4.0% | |
| + | 19413 | 4.0% |
| 4 | 17545 | 3.6% |
| 5 | 17230 | 3.5% |
| Other values (25) | 43180 | 8.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 349479 | |
| Other Punctuation | 58239 | 12.0% |
| Dash Punctuation | 38826 | 8.0% |
| Space Separator | 19417 | 4.0% |
| Math Symbol | 19413 | 4.0% |
| Lowercase Letter | 27 | < 0.1% |
| Uppercase Letter | 4 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 4 | |
| e | 3 | |
| a | 2 | 7.4% |
| i | 2 | 7.4% |
| p | 2 | 7.4% |
| u | 2 | 7.4% |
| s | 2 | 7.4% |
| k | 1 | 3.7% |
| c | 1 | 3.7% |
| t | 1 | 3.7% |
| Other values (7) | 7 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 144298 | |
| 1 | 57550 | 16.5% |
| 2 | 50192 | 14.4% |
| 3 | 19515 | 5.6% |
| 4 | 17545 | 5.0% |
| 5 | 17230 | 4.9% |
| 9 | 11170 | 3.2% |
| 8 | 10933 | 3.1% |
| 7 | 10541 | 3.0% |
| 6 | 10505 | 3.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1 | |
| P | 1 | |
| C | 1 | |
| E | 1 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 58239 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 38826 |
Space Separator
| Value | Count | Frequency (%) |
| 19417 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 19413 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 485374 | |
| Latin | 31 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 4 | 12.9% |
| e | 3 | 9.7% |
| a | 2 | 6.5% |
| i | 2 | 6.5% |
| p | 2 | 6.5% |
| u | 2 | 6.5% |
| s | 2 | 6.5% |
| k | 1 | 3.2% |
| S | 1 | 3.2% |
| c | 1 | 3.2% |
| Other values (11) | 11 |
Common
| Value | Count | Frequency (%) |
| 0 | 144298 | |
| : | 58239 | |
| 1 | 57550 | 11.9% |
| 2 | 50192 | 10.3% |
| - | 38826 | 8.0% |
| 3 | 19515 | 4.0% |
| 19417 | 4.0% | |
| + | 19413 | 4.0% |
| 4 | 17545 | 3.6% |
| 5 | 17230 | 3.5% |
| Other values (4) | 43149 | 8.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 485405 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 144298 | |
| : | 58239 | |
| 1 | 57550 | 11.9% |
| 2 | 50192 | 10.3% |
| - | 38826 | 8.0% |
| 3 | 19515 | 4.0% |
| 19417 | 4.0% | |
| + | 19413 | 4.0% |
| 4 | 17545 | 3.6% |
| 5 | 17230 | 3.5% |
| Other values (25) | 43180 | 8.9% |
updated_at
Text
| Distinct | 19181 |
|---|---|
| Distinct (%) | 98.7% |
| Missing | 23 |
| Missing (%) | 0.1% |
| Memory size | 2.0 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 25 |
| Mean length | 24.829337 |
| Min length | 1 |
Unique
| Unique | 19056 |
|---|---|
| Unique (%) | 98.0% |
Sample
| 1st row | 2023-10-13 11:21:10+00:00 |
|---|---|
| 2nd row | 2023-10-07 06:26:14+00:00 |
| 3rd row | 2023-10-02 02:11:21+00:00 |
| 4th row | 2023-10-12 12:54:59+00:00 |
| 5th row | 2023-10-06 11:58:41+00:00 |
| Value | Count | Frequency (%) |
| 2023-10-13 | 837 | 2.2% |
| 2023-10-11 | 813 | 2.1% |
| 2023-10-12 | 809 | 2.1% |
| 2023-10-10 | 716 | 1.8% |
| 2023-10-09 | 653 | 1.7% |
| 2023-10-04 | 530 | 1.4% |
| 2023-10-03 | 474 | 1.2% |
| 2023-10-05 | 463 | 1.2% |
| 2023-10-06 | 460 | 1.2% |
| 2023-09-28 | 433 | 1.1% |
| Other values (17627) | 32546 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 142136 | |
| 2 | 62535 | |
| : | 57888 | |
| 1 | 43291 | 9.0% |
| - | 38592 | 8.0% |
| 3 | 33261 | 6.9% |
| 19300 | 4.0% | |
| + | 19296 | 4.0% |
| 4 | 13659 | 2.8% |
| 5 | 13559 | 2.8% |
| Other values (24) | 39066 | 8.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 347479 | |
| Other Punctuation | 57891 | 12.0% |
| Dash Punctuation | 38592 | 8.0% |
| Space Separator | 19300 | 4.0% |
| Math Symbol | 19296 | 4.0% |
| Lowercase Letter | 21 | < 0.1% |
| Uppercase Letter | 3 | < 0.1% |
| Close Punctuation | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 3 | |
| o | 2 | |
| t | 2 | |
| l | 2 | |
| e | 2 | |
| v | 2 | |
| y | 1 | 4.8% |
| g | 1 | 4.8% |
| d | 1 | 4.8% |
| i | 1 | 4.8% |
| Other values (4) | 4 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 142136 | |
| 2 | 62535 | |
| 1 | 43291 | 12.5% |
| 3 | 33261 | 9.6% |
| 4 | 13659 | 3.9% |
| 5 | 13559 | 3.9% |
| 9 | 13040 | 3.8% |
| 8 | 9649 | 2.8% |
| 7 | 8491 | 2.4% |
| 6 | 7858 | 2.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 57888 | |
| . | 2 | < 0.1% |
| " | 1 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 1 | |
| J | 1 | |
| M | 1 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 38592 |
Space Separator
| Value | Count | Frequency (%) |
| 19300 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 19296 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 482559 | |
| Latin | 24 | < 0.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 142136 | |
| 2 | 62535 | |
| : | 57888 | |
| 1 | 43291 | 9.0% |
| - | 38592 | 8.0% |
| 3 | 33261 | 6.9% |
| 19300 | 4.0% | |
| + | 19296 | 4.0% |
| 4 | 13659 | 2.8% |
| 5 | 13559 | 2.8% |
| Other values (7) | 39042 | 8.1% |
Latin
| Value | Count | Frequency (%) |
| a | 3 | |
| o | 2 | 8.3% |
| t | 2 | 8.3% |
| l | 2 | 8.3% |
| e | 2 | 8.3% |
| v | 2 | 8.3% |
| P | 1 | 4.2% |
| y | 1 | 4.2% |
| g | 1 | 4.2% |
| d | 1 | 4.2% |
| Other values (7) | 7 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 482583 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 142136 | |
| 2 | 62535 | |
| : | 57888 | |
| 1 | 43291 | 9.0% |
| - | 38592 | 8.0% |
| 3 | 33261 | 6.9% |
| 19300 | 4.0% | |
| + | 19296 | 4.0% |
| 4 | 13659 | 2.8% |
| 5 | 13559 | 2.8% |
| Other values (24) | 39066 | 8.1% |
text_bot_count
Categorical
High correlation  Imbalance 
| Distinct | 28 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 148 |
| Missing (%) | 0.8% |
| Memory size | 1.6 MiB |
| 0 | 18537 |
|---|---|
| 1 | 420 |
| 2 | 245 |
| 3 | 73 |
| 4 | 9 |
| Other values (23) | 27 |
Length
| Max length | 25 |
|---|---|
| Median length | 1 |
| Mean length | 1.0205064 |
| Min length | 1 |
Unique
| Unique | 22 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 18537 | 95.3% |
| 1 | 420 | 2.2% |
| 2 | 245 | 1.3% |
| 3 | 73 | 0.4% |
| 4 | 9 | < 0.1% |
| 5 | 5 | < 0.1% |
| 246 | 1 | < 0.1% |
| 35 | 1 | < 0.1% |
| 2014-02-27 00:18:12+00:00 | 1 | < 0.1% |
| 2023-10-07 17:17:23+00:00 | 1 | < 0.1% |
| Other values (18) | 18 | 0.1% |
| (Missing) | 148 | 0.8% |
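The table above mixes small integer counts with timestamp strings, which may indicate that a few source rows were shifted by one or more columns during parsing. That reading is an assumption, not something the report states. As a minimal sketch, the hypothetical helper `split_counts` below separates entries that parse as non-negative integers from everything else, so the suspect rows can be inspected before the column is treated as numeric.

```python
def split_counts(values):
    """Split raw string entries into integer counts and suspect entries.

    An entry counts as valid only if, after stripping whitespace, it consists
    solely of ASCII digits. Everything else is returned unchanged in `suspect`.
    """
    valid, suspect = [], []
    for raw in values:
        text = raw.strip()
        if text.isascii() and text.isdigit():
            valid.append(int(text))
        else:
            suspect.append(raw)
    return valid, suspect


# A few values of the kind listed in the Common Values table above.
sample = ["0", "1", "246", "2014-02-27 00:18:12+00:00", "35"]
valid, suspect = split_counts(sample)

print("valid:", valid)
print("suspect:", suspect)
```

Keeping the suspect entries rather than silently coercing them makes it possible to trace them back to the original rows and decide whether to repair or drop them.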
Length
| Value | Count | Frequency (%) |
| 0 | 18537 | |
| 1 | 420 | 2.2% |
| 2 | 245 | 1.3% |
| 3 | 73 | 0.4% |
| 4 | 9 | < 0.1% |
| 5 | 5 | < 0.1% |
| 2023-10-06 | 2 | < 0.1% |
| 35 | 1 | < 0.1% |
| 2014-02-27 | 1 | < 0.1% |
| 00:18:12+00:00 | 1 | < 0.1% |
| Other values (33) | 33 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 18655 | |
| 1 | 468 | 2.4% |
| 2 | 288 | 1.5% |
| 3 | 95 | 0.5% |
| : | 48 | 0.2% |
| - | 32 | 0.2% |
| 4 | 22 | 0.1% |
| 5 | 19 | 0.1% |
| (space) | 17 | 0.1% |
| + | 16 | 0.1% |
| Other values (8) | 47 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 19590 | 99.4% |
| Other Punctuation | 48 | 0.2% |
| Dash Punctuation | 32 | 0.2% |
| Space Separator | 17 | 0.1% |
| Math Symbol | 16 | 0.1% |
| Lowercase Letter | 3 | < 0.1% |
| Uppercase Letter | 1 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 18655 | 95.2% |
| 1 | 468 | 2.4% |
| 2 | 288 | 1.5% |
| 3 | 95 | 0.5% |
| 4 | 22 | 0.1% |
| 5 | 19 | 0.1% |
| 7 | 12 | 0.1% |
| 8 | 11 | 0.1% |
| 9 | 11 | 0.1% |
| 6 | 9 | < 0.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 1 | 33.3% |
| s | 1 | 33.3% |
| t | 1 | 33.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 48 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 32 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 17 | 100.0% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 16 | 100.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 19703 | > 99.9% |
| Latin | 4 | < 0.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 18655 | 94.7% |
| 1 | 468 | 2.4% |
| 2 | 288 | 1.5% |
| 3 | 95 | 0.5% |
| : | 48 | 0.2% |
| - | 32 | 0.2% |
| 4 | 22 | 0.1% |
| 5 | 19 | 0.1% |
| (space) | 17 | 0.1% |
| + | 16 | 0.1% |
| Other values (4) | 43 | 0.2% |
Latin
| Value | Count | Frequency (%) |
| R | 1 | 25.0% |
| u | 1 | 25.0% |
| s | 1 | 25.0% |
| t | 1 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19707 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 18655 | 94.7% |
| 1 | 468 | 2.4% |
| 2 | 288 | 1.5% |
| 3 | 95 | 0.5% |
| : | 48 | 0.2% |
| - | 32 | 0.2% |
| 4 | 22 | 0.1% |
| 5 | 19 | 0.1% |
| (space) | 17 | 0.1% |
| + | 16 | 0.1% |
| Other values (8) | 47 | 0.2% |
log_public_repos
Real number (ℝ)
High correlation  Zeros 
| Distinct | 667 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 46 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.3764337 |
| Minimum | 0 |
|---|---|
| Maximum | 10.819798 |
| Zeros | 966 |
| Zeros (%) | 5.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 820.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.69314718 |
| Q1 | 2.4849066 |
| median | 3.5553481 |
| Q3 | 4.4188406 |
| 95-th percentile | 5.5174529 |
| Maximum | 10.819798 |
| Range | 10.819798 |
| Interquartile range (IQR) | 1.933934 |
Descriptive statistics
| Standard deviation | 1.4886652 |
|---|---|
| Coefficient of variation (CV) | 0.44089868 |
| Kurtosis | 0.025800282 |
| Mean | 3.3764337 |
| Median Absolute Deviation (MAD) | 0.93328831 |
| Skewness | -0.38278236 |
| Sum | 65546.708 |
| Variance | 2.216124 |
| Monotonicity | Not monotonic |
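Several of the descriptive statistics above are simple functions of one another, so they can be cross-checked by hand. The sketch below recomputes the coefficient of variation, the variance and the interquartile range for log_public_repos from the reported mean, standard deviation and quartiles. The variable names are illustrative, and the inputs are the rounded figures printed in the tables, so agreement is only expected to about six decimal places.

```python
# Figures copied from the log_public_repos tables above (rounded as reported).
mean = 3.3764337
std = 1.4886652
q1 = 2.4849066
q3 = 4.4188406

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print(f"CV={cv:.8f} variance={variance:.6f} IQR={iqr:.6f}")
```

The recomputed values agree with the reported CV (0.44089868), variance (2.216124) and IQR (1.933934) up to rounding of the inputs.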
Common Values
| Value | Count | Frequency (%) |
| 0 | 966 | 5.0% |
| 0.6931471806 | 556 | 2.9% |
| 1.098612289 | 468 | 2.4% |
| 1.386294361 | 394 | 2.0% |
| 1.609437912 | 377 | 1.9% |
| 1.945910149 | 362 | 1.9% |
| 1.791759469 | 350 | 1.8% |
| 2.079441542 | 328 | 1.7% |
| 2.302585093 | 306 | 1.6% |
| 2.197224577 | 300 | 1.5% |
| Other values (657) | 15006 | 77.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 966 | 5.0% |
| 0.6931471806 | 556 | 2.9% |
| 1.098612289 | 468 | 2.4% |
| 1.386294361 | 394 | 2.0% |
| 1.609437912 | 377 | 1.9% |
| 1.791759469 | 350 | 1.8% |
| 1.945910149 | 362 | 1.9% |
| 2.079441542 | 328 | 1.7% |
| 2.197224577 | 300 | 1.5% |
| 2.302585093 | 306 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 10.81979828 | 1 | < 0.1% |
| 10.23088301 | 1 | < 0.1% |
| 10.17964092 | 1 | < 0.1% |
| 10.02654554 | 1 | < 0.1% |
| 9.937599082 | 1 | < 0.1% |
| 9.765718623 | 1 | < 0.1% |
| 9.740144754 | 1 | < 0.1% |
| 9.731512288 | 1 | < 0.1% |
| 9.164819857 | 1 | < 0.1% |
| 8.863474306 | 1 | < 0.1% |
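The most frequent values of log_public_repos (0, 0.6931471806, 1.098612289, ...) equal ln 1, ln 2, ln 3 and so on, which is consistent with the column being computed as log(1 + public_repos). The report does not state the transform, so this is an inference from the values. Under that assumption, a short sketch can map the logged values back to raw counts. The function name `from_log` is hypothetical.

```python
import math


def from_log(value):
    """Invert an assumed log(1 + x) transform and round to the nearest count."""
    return round(math.expm1(value))


# Most common logged values from the table above.
common = [0.6931471806, 1.098612289, 1.386294361, 1.609437912]
recovered = [from_log(v) for v in common]

# Largest logged value reported for log_public_repos.
largest = from_log(10.81979828)

print(recovered, largest)
```

If the assumption holds, the common values correspond to accounts with 1, 2, 3 and 4 public repositories, and the maximum corresponds to 50000 repositories.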
log_public_gists
Real number (ℝ)
High correlation  Zeros 
| Distinct | 390 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 38 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3872489 |
| Minimum | 0 |
|---|---|
| Maximum | 10.929207 |
| Zeros | 7790 |
| Zeros (%) | 40.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 820.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1.0986123 |
| Q3 | 2.3978953 |
| 95-th percentile | 4.2626799 |
| Maximum | 10.929207 |
| Range | 10.929207 |
| Interquartile range (IQR) | 2.3978953 |
Descriptive statistics
| Standard deviation | 1.5188762 |
|---|---|
| Coefficient of variation (CV) | 1.0948837 |
| Kurtosis | 0.40792863 |
| Mean | 1.3872489 |
| Median Absolute Deviation (MAD) | 1.0986123 |
| Skewness | 0.96025069 |
| Sum | 26941.762 |
| Variance | 2.306985 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 7790 | 40.0% |
| 0.6931471806 | 1818 | 9.3% |
| 1.098612289 | 1118 | 5.7% |
| 1.386294361 | 804 | 4.1% |
| 1.609437912 | 652 | 3.4% |
| 1.791759469 | 610 | 3.1% |
| 1.945910149 | 473 | 2.4% |
| 2.079441542 | 398 | 2.0% |
| 2.302585093 | 323 | 1.7% |
| 2.197224577 | 311 | 1.6% |
| Other values (380) | 5124 | 26.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 7790 | 40.0% |
| 0.6931471806 | 1818 | 9.3% |
| 1.098612289 | 1118 | 5.7% |
| 1.386294361 | 804 | 4.1% |
| 1.609437912 | 652 | 3.4% |
| 1.791759469 | 610 | 3.1% |
| 1.945910149 | 473 | 2.4% |
| 2.079441542 | 398 | 2.0% |
| 2.197224577 | 311 | 1.6% |
| 2.302585093 | 323 | 1.7% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 10.92920652 | 1 | < 0.1% |
| 10.89044176 | 1 | < 0.1% |
| 10.27311821 | 1 | < 0.1% |
| 10.19913779 | 1 | < 0.1% |
| 9.647497927 | 1 | < 0.1% |
| 9.41970949 | 1 | < 0.1% |
| 9.269080867 | 1 | < 0.1% |
| 9.096611607 | 1 | < 0.1% |
| 8.403352375 | 1 | < 0.1% |
| 8.33423143 | 1 | < 0.1% |
log_followers
Real number (ℝ)
High correlation  Zeros 
| Distinct | 1564 |
|---|---|
| Distinct (%) | 8.0% |
| Missing | 30 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.490025 |
| Minimum | 0 |
|---|---|
| Maximum | 11.469527 |
| Zeros | 1436 |
| Zeros (%) | 7.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 820.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2.0794415 |
| median | 3.5263605 |
| Q3 | 4.8283137 |
| 95-th percentile | 6.7136848 |
| Maximum | 11.469527 |
| Range | 11.469527 |
| Interquartile range (IQR) | 2.7488722 |
Descriptive statistics
| Standard deviation | 1.9544797 |
|---|---|
| Coefficient of variation (CV) | 0.56001883 |
| Kurtosis | -0.28827067 |
| Mean | 3.490025 |
| Median Absolute Deviation (MAD) | 1.3291359 |
| Skewness | 0.13372599 |
| Sum | 67807.696 |
| Variance | 3.819991 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1436 | 7.4% |
| 0.6931471806 | 796 | 4.1% |
| 1.098612289 | 616 | 3.2% |
| 1.386294361 | 504 | 2.6% |
| 1.609437912 | 443 | 2.3% |
| 1.791759469 | 409 | 2.1% |
| 1.945910149 | 396 | 2.0% |
| 2.079441542 | 342 | 1.8% |
| 2.197224577 | 337 | 1.7% |
| 2.302585093 | 308 | 1.6% |
| Other values (1554) | 13842 | 71.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 1436 | 7.4% |
| 0.6931471806 | 796 | 4.1% |
| 1.098612289 | 616 | 3.2% |
| 1.386294361 | 504 | 2.6% |
| 1.609437912 | 443 | 2.3% |
| 1.791759469 | 409 | 2.1% |
| 1.945910149 | 396 | 2.0% |
| 2.079441542 | 342 | 1.8% |
| 2.197224577 | 337 | 1.7% |
| 2.302585093 | 308 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 11.46952724 | 1 | < 0.1% |
| 11.35017121 | 1 | < 0.1% |
| 11.10049616 | 1 | < 0.1% |
| 10.97597829 | 1 | < 0.1% |
| 10.34563811 | 1 | < 0.1% |
| 10.31850687 | 1 | < 0.1% |
| 10.2995755 | 1 | < 0.1% |
| 10.28926003 | 1 | < 0.1% |
| 10.25456687 | 1 | < 0.1% |
| 10.23192762 | 1 | < 0.1% |
log_following
Real number (ℝ)
High correlation  Zeros 
| Distinct | 607 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 151 |
| Missing (%) | 0.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.8481194 |
| Minimum | 0 |
|---|---|
| Maximum | 9.7256758 |
| Zeros | 5901 |
| Zeros (%) | 30.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 820.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1.6094379 |
| Q3 | 3.1354942 |
| 95-th percentile | 4.9904326 |
| Maximum | 9.7256758 |
| Range | 9.7256758 |
| Interquartile range (IQR) | 3.1354942 |
Descriptive statistics
| Standard deviation | 1.7374152 |
|---|---|
| Coefficient of variation (CV) | 0.94009897 |
| Kurtosis | -0.25776983 |
| Mean | 1.8481194 |
| Median Absolute Deviation (MAD) | 1.6094379 |
| Skewness | 0.68501686 |
| Sum | 35683.49 |
| Variance | 3.0186115 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 5901 | 30.3% |
| 0.6931471806 | 1709 | 8.8% |
| 1.098612289 | 1073 | 5.5% |
| 1.386294361 | 774 | 4.0% |
| 1.609437912 | 596 | 3.1% |
| 1.791759469 | 518 | 2.7% |
| 1.945910149 | 470 | 2.4% |
| 2.079441542 | 399 | 2.1% |
| 2.197224577 | 357 | 1.8% |
| 2.302585093 | 317 | 1.6% |
| Other values (597) | 7194 | 37.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 5901 | 30.3% |
| 0.6931471806 | 1709 | 8.8% |
| 1.098612289 | 1073 | 5.5% |
| 1.386294361 | 774 | 4.0% |
| 1.609437912 | 596 | 3.1% |
| 1.791759469 | 518 | 2.7% |
| 1.945910149 | 470 | 2.4% |
| 2.079441542 | 399 | 2.1% |
| 2.197224577 | 357 | 1.8% |
| 2.302585093 | 317 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 9.725675811 | 1 | < 0.1% |
| 9.676084944 | 1 | < 0.1% |
| 9.386140712 | 1 | < 0.1% |
| 9.236884927 | 1 | < 0.1% |
| 9.182043773 | 1 | < 0.1% |
| 9.178540059 | 1 | < 0.1% |
| 9.162514742 | 1 | < 0.1% |
| 9.145054905 | 1 | < 0.1% |
| 8.905851181 | 1 | < 0.1% |
| 8.733594062 | 1 | < 0.1% |
Interactions
Correlations
| | blog | company | followers | following | hireable | label | location | log_followers | log_following | log_public_gists | log_public_repos | public_gists | public_repos | site_admin | text_bot_count | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| blog | 1.000 | 0.258 | 0.047 | 0.036 | 0.218 | 0.023 | 0.370 | 0.424 | 0.358 | 0.359 | 0.360 | 0.017 | 0.000 | 0.005 | 0.065 | 0.081 |
| company | 0.258 | 1.000 | 0.017 | 0.005 | 0.058 | 0.070 | 0.393 | 0.257 | 0.195 | 0.181 | 0.196 | 0.000 | 0.009 | 0.025 | 0.070 | 0.102 |
| followers | 0.047 | 0.017 | 1.000 | 0.535 | 0.000 | 0.000 | 0.020 | 1.000 | 0.535 | 0.592 | 0.649 | 0.592 | 0.649 | 0.000 | 0.000 | 0.000 |
| following | 0.036 | 0.005 | 0.535 | 1.000 | 0.050 | 0.000 | 0.008 | 0.535 | 1.000 | 0.439 | 0.536 | 0.439 | 0.536 | 0.000 | 0.088 | 0.000 |
| hireable | 0.218 | 0.058 | 0.000 | 0.050 | 1.000 | 0.056 | 0.177 | 0.213 | 0.266 | 0.199 | 0.226 | 0.000 | 0.015 | 0.013 | 0.061 | 0.041 |
| label | 0.023 | 0.070 | 0.000 | 0.000 | 0.056 | 1.000 | 0.128 | 0.163 | 0.164 | 0.139 | 0.362 | 0.035 | 0.019 | 0.006 | 0.578 | 0.370 |
| location | 0.370 | 0.393 | 0.020 | 0.008 | 0.177 | 0.128 | 1.000 | 0.394 | 0.357 | 0.290 | 0.349 | 0.000 | 0.000 | 0.019 | 0.129 | 0.125 |
| log_followers | 0.424 | 0.257 | 1.000 | 0.535 | 0.213 | 0.163 | 0.394 | 1.000 | 0.535 | 0.592 | 0.649 | 0.592 | 0.649 | 0.075 | 0.059 | 0.226 |
| log_following | 0.358 | 0.195 | 0.535 | 1.000 | 0.266 | 0.164 | 0.357 | 0.535 | 1.000 | 0.439 | 0.536 | 0.439 | 0.536 | 0.000 | 0.094 | 0.115 |
| log_public_gists | 0.359 | 0.181 | 0.592 | 0.439 | 0.199 | 0.139 | 0.290 | 0.592 | 0.439 | 1.000 | 0.619 | 1.000 | 0.619 | 0.038 | 0.060 | 0.092 |
| log_public_repos | 0.360 | 0.196 | 0.649 | 0.536 | 0.226 | 0.362 | 0.349 | 0.649 | 0.536 | 0.619 | 1.000 | 0.619 | 1.000 | 0.019 | 0.179 | 0.322 |
| public_gists | 0.017 | 0.000 | 0.592 | 0.439 | 0.000 | 0.035 | 0.000 | 0.592 | 0.439 | 1.000 | 0.619 | 1.000 | 0.619 | 0.000 | 0.037 | 0.000 |
| public_repos | 0.000 | 0.009 | 0.649 | 0.536 | 0.015 | 0.019 | 0.000 | 0.649 | 0.536 | 0.619 | 1.000 | 0.619 | 1.000 | 0.000 | 0.037 | 0.000 |
| site_admin | 0.005 | 0.025 | 0.000 | 0.000 | 0.013 | 0.006 | 0.019 | 0.075 | 0.000 | 0.038 | 0.019 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
| text_bot_count | 0.065 | 0.070 | 0.000 | 0.088 | 0.061 | 0.578 | 0.129 | 0.059 | 0.094 | 0.060 | 0.179 | 0.037 | 0.037 | 0.000 | 1.000 | 0.511 |
| type | 0.081 | 0.102 | 0.000 | 0.000 | 0.041 | 0.370 | 0.125 | 0.226 | 0.115 | 0.092 | 0.322 | 0.000 | 0.000 | 0.000 | 0.511 | 1.000 |
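In the matrix above, each raw count and its log_ counterpart have a coefficient of 1.000 and identical coefficients against the other numeric columns (for example, followers and log_followers both show 0.535 against following). That pattern is what a rank-based coefficient such as Spearman's would produce, because a strictly increasing transform leaves ranks unchanged. The report excerpt does not say which coefficient was used, so the sketch below only illustrates the invariance. It uses a small hand-written Spearman implementation and the followers and following values from the first ten sample rows. All function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# followers and following from the first ten sample rows of the report.
followers = [5.0, 9.0, 1212.0, 84.0, 6.0, 22.0, 63.0, 22.0, 37.0, 14.0]
following = [1.0, 6.0, 221.0, 2.0, 2.0, 7.0, 16.0, 0.0, 596.0, 2.0]
log_followers = [math.log1p(v) for v in followers]  # assumed log(1 + x)

raw_vs_log = spearman(followers, log_followers)
raw_vs_following = spearman(followers, following)
log_vs_following = spearman(log_followers, following)
```

Because the ranks of `followers` and `log_followers` are identical, `raw_vs_log` is 1 up to floating-point error and the other two coefficients are equal. For a rank-based measure, the raw and logged columns therefore carry the same information.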
Missing values
Sample
| | label | type | site_admin | company | blog | location | hireable | bio | public_repos | public_gists | followers | following | created_at | updated_at | text_bot_count | log_public_repos | log_public_gists | log_followers | log_following |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 26.0 | 1.0 | 5.0 | 1.0 | 2011-09-26 17:27:03+00:00 | 2023-10-13 11:21:10+00:00 | 0 | 3.295837 | 0.693147 | 1.791759 | 0.693147 |
| 1 | Human | 1 | 0 | 0 | 1 | 0 | 1 | I just press the buttons randomly, and the program evolves... | 30.0 | 3.0 | 9.0 | 6.0 | 2015-06-29 10:12:46+00:00 | 2023-10-07 06:26:14+00:00 | 0 | 3.433987 | 1.386294 | 2.302585 | 1.945910 |
| 2 | Human | 1 | 0 | 1 | 1 | 1 | 1 | Time is unimportant,\nonly life important. | 103.0 | 49.0 | 1212.0 | 221.0 | 2008-08-29 16:20:03+00:00 | 2023-10-02 02:11:21+00:00 | 0 | 4.644391 | 3.912023 | 7.100852 | 5.402677 |
| 3 | Bot | 1 | 0 | 0 | 0 | 1 | 0 | NaN | 49.0 | 0.0 | 84.0 | 2.0 | 2014-05-20 18:43:09+00:00 | 2023-10-12 12:54:59+00:00 | 0 | 3.912023 | 0.000000 | 4.442651 | 1.098612 |
| 4 | Human | 1 | 0 | 0 | 0 | 0 | 1 | NaN | 11.0 | 1.0 | 6.0 | 2.0 | 2012-08-16 14:19:13+00:00 | 2023-10-06 11:58:41+00:00 | 0 | 2.484907 | 0.693147 | 1.945910 | 1.098612 |
| 5 | Human | 1 | 0 | 1 | 1 | 1 | 0 | Done studying. Need challenges. | 56.0 | 1.0 | 22.0 | 7.0 | 2017-04-11 14:08:07+00:00 | 2023-10-11 05:59:26+00:00 | 0 | 4.043051 | 0.693147 | 3.135494 | 2.079442 |
| 6 | Human | 1 | 0 | 1 | 1 | 1 | 1 | Administrator of MOONGIFT that is introducing open source software everyday to Japanese engineers since 2004. | 277.0 | 1139.0 | 63.0 | 16.0 | 2008-04-07 22:22:22+00:00 | 2023-09-27 09:04:56+00:00 | 0 | 5.627621 | 7.038784 | 4.158883 | 2.833213 |
| 7 | Human | 1 | 0 | 1 | 0 | 1 | 0 | Senior Software Engineer at Google, working on Certificate Transparency and generalized transparency. | 37.0 | 1.0 | 22.0 | 0.0 | 2012-01-19 21:57:07+00:00 | 2023-08-07 16:06:34+00:00 | 0 | 3.637586 | 0.693147 | 3.135494 | 0.000000 |
| 8 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 27.0 | 2.0 | 37.0 | 596.0 | 2019-12-24 20:04:33+00:00 | 2023-10-12 11:55:01+00:00 | 0 | 3.332205 | 1.098612 | 3.637586 | 6.391917 |
| 9 | Human | 1 | 0 | 1 | 1 | 1 | 0 | Hi | 42.0 | 9.0 | 14.0 | 2.0 | 2013-07-23 23:29:34+00:00 | 2023-10-09 20:47:05+00:00 | 0 | 3.761200 | 2.302585 | 2.708050 | 1.098612 |
| | label | type | site_admin | company | blog | location | hireable | bio | public_repos | public_gists | followers | following | created_at | updated_at | text_bot_count | log_public_repos | log_public_gists | log_followers | log_following |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19503 | Human | 1 | 0 | 1 | 0 | 1 | 0 | NaN | 30.0 | 0.0 | 10.0 | 11.0 | 2016-09-10 09:45:00+00:00 | 2023-10-06 11:30:51+00:00 | 0 | 3.433987 | 0.000000 | 2.397895 | 2.484907 |
| 19504 | Human | 1 | 0 | 0 | 0 | 1 | 1 | NaN | 37.0 | 19.0 | 91.0 | 6.0 | 2012-04-19 03:27:14+00:00 | 2023-10-07 18:13:52+00:00 | 0 | 3.637586 | 2.995732 | 4.521789 | 1.945910 |
| 19505 | Bot | 1 | 0 | 0 | 0 | 0 | 0 | I am the bot account of @alvaroaleman | 1.0 | 0.0 | 0.0 | 0.0 | 2018-12-15 19:55:31+00:00 | 2021-07-27 14:14:25+00:00 | 2 | 0.693147 | 0.000000 | 0.000000 | 0.000000 |
| 19506 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 3.0 | 0.0 | 1.0 | 0.0 | 2013-11-10 16:05:37+00:00 | 2023-08-31 14:26:08+00:00 | 2 | 1.386294 | 0.000000 | 0.693147 | 0.000000 |
| 19507 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 2020-10-01 18:30:32+00:00 | 2020-12-29 19:45:12+00:00 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 19508 | Bot | 1 | 0 | 1 | 1 | 1 | 0 | Tony came to Linux in 1994 and has never looked back. His entire professional career has been spent working with or on Linux. First as a systems administrator | 36.0 | 16.0 | 11.0 | 4.0 | 2014-07-02 23:27:34+00:00 | 2023-08-15 16:38:34+00:00 | 0 | 3.610918 | 2.833213 | 2.484907 | 1.609438 |
| 19509 | Human | 1 | 0 | 0 | 0 | 0 | 0 | NaN | 16.0 | 0.0 | 3.0 | 0.0 | 2017-12-06 21:56:31+00:00 | 2023-07-26 18:32:25+00:00 | 0 | 2.833213 | 0.000000 | 1.386294 | 0.000000 |
| 19510 | Human | 1 | 0 | 1 | 0 | 1 | 0 | Software engineer at RealTracs. | 13.0 | 0.0 | 10.0 | 1.0 | 2015-11-14 14:44:05+00:00 | 2022-08-23 21:09:49+00:00 | 0 | 2.639057 | 0.000000 | 2.397895 | 0.693147 |
| 19511 | Human | 1 | 0 | 1 | 0 | 0 | 0 | NaN | 7.0 | 0.0 | 2.0 | 0.0 | 2021-11-23 18:55:29+00:00 | 2023-10-06 22:50:45+00:00 | 0 | 2.079442 | 0.000000 | 1.098612 | 0.000000 |
| 19512 | Bot | 1 | 0 | 0 | 0 | 1 | 0 | NaN | 10.0 | 0.0 | 1.0 | 0.0 | 2016-04-22 22:11:59+00:00 | 2022-07-07 19:48:21+00:00 | 0 | 2.397895 | 0.000000 | 0.693147 | 0.000000 |